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Abstract 

We derive new tests for fixed and random ANOVA based on a thresh- 
5h olded point estimate. The pivotal quantity is the threshold that sets all 

the coefficients of the null hypothesis to zero. Thresholding can be em- 
ployed coordinatewise or blockwise, or both, which leads to tests with 
good power properties under alternative hypotheses that are either sparse 
or dense. 

Keywords: ANOVA; multiple comparisons test; mixed effects; sparsity; 
thresholding. 

4-3 1 Introduction 

Analysis of variance (ANOVA) has something in common with thresholding 
regression in that ANOVA tests a null hypothesis that some parameters are 
equal to zero, and thresholding performs model selection by setting some coef- 
J> ficients to zero. This paper exploits this link to derive ANOVA tests based on 

thresholding. 

While ANOVA and tests belong to the general knowledge of a statistician, 
\^ thresholding is a more recent concept that we now review. A simple way to 

• introduce thresholding and thresholding test is to consider the canonical regres- 

sion model 

K„ = 0„ + e„ for n = l,...,iV, (1) 

. . where the noise is independent standard Gaussian. A thresholded point estimate 

^ of parameters 6 = (9i, . . . , 0^?) is obtained by applying a function 77^ to the data 

Y = (Yi, . . . , Yn), that is, 

"5 ^ = ^a(Y) (2) 

with the property that some or all entries of the estimate are null for a large 
enough threshold A. Defining {x)+ — max(x, 0), thresholding can be performed: 



coordinatewise, for instance with soft-thresholding Donoho and John- 



stone 1994 : considering each n in turn, estimate On with 



vl°'\yn)=[l-^^Yn=:k. (3) 
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• blockwise, for instance with truncated James-Stein thresholding James 
and Stein 1961 : considering all entries of Y together, estimate 6 with 



(Y) 



(4) 



The choice of the threshold A plays an important role in the quality of 



and Hochberg 1995 



the estimation in regression (see for instance Donoho and Johnstone 


1994 , 


Donoho and Johnstone 


1995|, [Tibshirani 


1996 , Yuan and Lin 


2006 , Efron 


2004 and Zou et al. 


20071) or to control the false discovery rate Benjamini 



In this article we are interested in linear analysis of variance and testing null 
hypotheses regarding factors and continuous covariates. We derive new powerful 
tests based on a thresholded point estimate of the coefficients, and we choose 
the threshold Aq for the test to have the desired level a. Since we commented 
that thresholding can be performed coordinatewise or blockwise, we derive tests 
based on either coordinate or block thresholding. Or hybrids of both. Two 
tests are used extensively in ANOVA: Tukey's multiple comparisons test and 
Fisher F-test. The first one is related to coordinate thresholding and the latter 
to block thresholding. One goal is to combine Tukey- and Fisher-like tests. 

We illustrate the link between thresholding and testing on the canonical 
model ([ij and derive two tests for Hq : 9i = . . . ~ 9n ^ using a thresholded 
point estimate. For simplicity we assume for now unit standard deviation for 
the Gaussian noise. The two tests are: 

• based on coordinatewise thresholding ([S]): clearly, for a sample y. Ay — 
max„=i...._7V \yn\ = ||y||oo is the smallest and finite threshold that sets all 
the estimated parameters to zero. Letting F/^g be the distribution of that 
statistics under the null hypothesis, and choosing Aq = -FX ^(1 — a), the 
test that rejects Hq if Ay > Aq, or, equivalently if at least one entry of 
the coordinatewise thresholded point estimate 9{Xa) is different from zero 
has the desired level a. Here Aq = —$"-^([1 — exp{log(l — a)/N}]/2) has 
a closed form expression, otherwise one can estimate it by Monte Carlo. 

call this test a max-test for an obvious reason. 



Arias-Castro et al. 2011 



based on blockwise thresholding Q: likewise, for a sample y, Ay = ||y||2 
is the smallest and finite threshold that sets all the estimated parameters 
to zero. Under the null, that statistics Aq ^ Xn- Therefore choosing the 
threshold A^ = F~2^{1 — a), the test that rejects Hq if 9{\a) is different 
from the null vector leads to another test of level a. 



Arias-Castro et al. 2011 studied the asymptotic relative power of both tests 
under either a dense or a sparse alternative hypothesis Hi. According to their 
definition. Hi is dense or sparse whether the parameters 6 satisfy either {||0||2 > 
B} or {ll^lloo > A} for some positive lower bounds B and A, respectively. They 
proved that the max-test has more power under the sparse alternative. 

Another goal of this paper is to derive thresholding-based tests for ANOVA 
considering either a coordinate or a block thresholding strategy, and see how 
they relate to and improve on existing tests. Our tests are based on thresholding 



estimators developed for linear models, and in particular lasso Tibshirani 1996 



grouped lasso Yuan and Lin 2006 and smooth blockwise iterative thresholding 
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Sardy 2012 . We show how the threshold parameter Aq of these estimators can 



be determined for the thresholding test to have the desired level a. Lockhart 
et al. 2013| consider a sequential approach of testing the significance of the 
successive lasso coefficients. 

Section [2] starts will the simple one-way AN OVA, derives a blockwise and 
coordinatewise tests, and investigate their relative power under a dense and 
a sparse alternative. To take the best of both (dense and sparse) alternative 
worlds, Section [3] derives a single a-level test, called 0&-I- test, that is nearly 
as powerful as either the blockwise or the coordinatewise tests under both al- 
ternatives. Section |4] is concerned with Tukey multiple comparisons test, and 
proves that it amounts to a coordinate thresholding test. The latter has the 
advantage of having the exact desired level not only in the balanced situation 
(like Tukey's), but also in the unbalanced one (where Tukey's is conservative), 
albeit a Monte Carlo estimate of the threshold. Section [s] presents the general 
framework for iterative thresholding-based tests for ANOVA, for which some 
coefficients may be thresholded coordinatewise, blockwise or both, depending 
on the nature of the parameters (fixed effects, interactions, random effects) and 
on the nature of the alternative hypothesis considered (dense or sparse). Sec- 
tion [6] applies the new test to a real data set modeled by mixed effects. SectionjT] 
proposes another selection of the threshold based on an extension of the uni- 
versal threshold to satisfy both good estimation and model selection properties. 
Section [8] draws some conclusions. 



2 One-way ANOVA: two tests 

To fix notation, consider one-way ANOVA with T treatments and R replications 
Ytr + (tr, t 1, . . . , T, r = 1, . . . , i?, (5) 

where etr ' ^ N(0, cr^) and the total number of observations is = TR. In 
matrix notation, ([5]) is Y = X + e, where A is an TV x T matrix with 



Y21 



Y 



2R 



Yti 



/ 1 



and X 



\ 






1 



\Ytr J V ... 1 / 

The matrix has orthogonal columns since X^X = R It- For testing 

iJo ; /ii = . . . = /^t(= A*) against Hi : for at least one i, ^.t ^ ^. (6) 
the most common approach is Fisher's test based on the pivot 
(RSSH„-RSS)/(r-l) 



RSS/(A^ - T) 



1,N-T, 
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where Tdi.d2 is the Fisher distribution with degrees of freedom di and d2, RSS = 
||Yo - XA'^'lli with p}-^ = {X^X)-^X^Yo, RSSho = ||Yo - Yolg and Yq ~ 
N(/zl, ct^/at). Next section shows that the same test can be derived based on 
block thresholding. 

2.1 Blockwise thresholding test 

It is well known that model ^ and test ^ are equivalent to testing 

Hq : di = . . . = 6t = against Hi : for at least one t, Ot (7) 
for the model 

y^^il + Xe + e. (8) 

As opposed to ([6|, the formulation ([t]) of the null hypothesis is sparse in the 
sense that the parameters of the null hypothesis are all zero. This motivates the 
following test based on thresholding. For a given level a, the idea is to derive a 
thresholded point estimate 9{Xc) and to control the threshold such that the 
point estimate is the null vector with the desired probability 1 ~ a under the 
null hypothesis. The parameter /i is not to be tested, so we first calculate 

yA = y-PAy with Pa = A{A^ A)'' (9) 

to remove the contribution of the x 1 matrix A = 1 corresponding to fi. Here 
Pa is the projection matrix in the range of A. Then, given a positive threshold 



A and a smoothness parameter s > 1, the block threshold estimate [Sardy |2012| 



- [^-jx4ih)J^^y^ (10) 

generalizes truncated James-Stein's thresholding J4) and has the property that 
9{X) = iff A > WX'^yAh- Note that 1/R in ([lO) stems from the inverse of 
X"^X. Observing that ||X'^yyi||2/(T is a pivot leads to the following theorem. 

Theorem 1 (Block thresholding test): Consider model ([8| for which we 
test (j?! at a prescribed level a. Let Yq ^ N(/^l, g'^In) be the distribution of Y 
under Hq and let a he a positive estimate of a such that tr/cr is a pivot. Then 
Ao,2 = ||-'^^(Yo - P4Yo)||2/ct(Yo) is a pivot with distribution F^^^. Defining 



the thresholded point estimate 0(y / u[y);\a,2) in (10 1 for the observations y 
and setting the threshold \a^2 such that F9 {Xa,2) = 1 — a, then the test 



1 if 0(y/<7(y);A„,2)^O 
otherwise 



has level a. Finally, letting CT^(y) be the standard unbiased estimate of variance, 
the block thresholding test is equivalent to Fisher test with the relation A^ 2 — 
R{T — ^)F^\p_^ n-tO^ ~ '^)' where Fjr is the cdf of the Fisher distribution. 

Proof: Ao,2' - (||C/o||2/a)/(a(Yo)/a), where Uo = ^^(Yo - P^Yq) ~ 
N(0,cr2(i?/j. _ R^/NJt)) with Jt the T x T matrix of ones. So the distri- 
bution of ratio Ao,2 does not depend on (/i, a). Moreover 

^Hj{y) = P {e{^yK,2) 7^ 0) = P (^ffl^ > A„,2) = l-^fo,.(Aa,2) = 
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The equivalence to Fisher's test using the standard unbiased estimate of variance 
for cr^ is straightforward. □ 

By equivalence with Fisher's test, the distribution F^^ ^ is known when using 
the standard unbiased estimate of variance, so that the (1 — Q!)-quantile Xaa can 
be calculated from the quantile of Fisher's distribution. In the more complex 
situations we will consider in the following, F^^ ^ does not have an explicit 
expression. This is for instance the case with this test if another estimate of 
variance, say a robust one, possibly dependent of the numerator, is employed. 
In that case Xa,2 can be estimated by Monte-Carlo simulation. 



2.2 Coordinatewise thresholding test 



Instead of blocking the T treatment parameters into one block of size T and 
thresholding blockwise, thresholding could be performed on T blocks of size one, 
known as coordinatewise thresholding. For blocks of size one, smooth blockwise 



iterative thresholding Sardy 2012| defines a point estimate as a solution to a 
set of nonlinear equations 



et{x) = 1- 



t 



1. . . 



1 T 



with vt^YA-Y. 



(11) 



for a given positive threshold A and smoothness parameter s > 1. Note that 
1/R in ( 11 1 stems from the inverse of x^xj = R for all t. For s = 1, this defines 



the lasso estimate in a way that makes thresholding clearly visible: we recognize 
soft-thresholding ([s]) applied to least squares estimates on the partial residuals. 

Moreover the estimate ^(A) ~ (6*1 (A), . . . , ^t(A)) satisfies the property that 
9t{X) = for ah t = 1,. .. ,r iff A > HX^yAlloo = maxt=i,...^T Ix^^y^l- This is a 
particular case of Lemma 1 proved in Sectionjs] Observing that ||-'^"'^yyi||oo/o' 
is a pivot leads to the following theorem, which proof is similar to that of 
Theorem 1. 



Theorem 2 (Coordinate thresholding test): Consider model ([s]) for which 
we test ([t]) at a prescribed level a. Let Yq ~ N(/lj1, (T^/at) be the distribution 
of Y under Hq and let be a positive estimate of a such that a /a is a pivot. 
Then Ag^oo = 11-'^'^ (Yq — P4Yo)||oo/o'(Yo) is a pivot with distribution F^^ ^. 

Defining the thrcsholdcd point estimate 0{y/a{y)\ Aq_oo) solution to ( 11 ) for the 
observations y and setting the threshold Aq,,, 
then the test 



such that Ft 



(Act,< 



= 1 — a. 



0(y) 



has level a. 



1 if 0t(y/a(y);A„^c 
otherwise 



7^ for at least one t G {1, . . . , T} 



2.3 Power analysis of both tests under two alternatives 



We now compare the power of the two tests proposed in Sections 2.1 and 2.2 
Given a level a, the thresholds 



Xn/2 — 



RjF,\l-a) and A„, 

X.'j' 



R^- 



l-(l-a)i/r^ 



(12) 
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respectively confer the blockwise (based on the 2-norm) and coordinatewise 
(based on the oo-norm) tests the desired level. We assume here that fx and a 
are known for simplicity, but the conclusions below would hold if both /i and a 
were estimated. Arias-Castro et al. 2011 prove the asymptotic result that the 



test based on coordinate thresholding behaves differently than Fisher's test in 
terms of power depending whether the alternative hypothesis is dense or sparse. 
We consider two alternatives: a dense alternative of the form 



iJi^ :0-^(±l,...,±l), 

and a sparse alternative of the form 

i/f :0 = 0(±1,O,...,O), 

where G IR. The power of both tests as a function of 9 under both alternatives 
is reported in Table [l] Figure [l] plots power as a function of 9 for T — b 
treatments and i? = 10 replications. This corroborates the asymptotic results 
of Arias-Castro et ah] [2011 



for a dense alternative, Fisher/block-test is more 



powerful, while for a sparse alternative coordinate-test is more powerful. 



Table 1: Power of blockwise and coordinatewise thresholding tests at a level a 
under a dense and sparse alternative. Notation: Ae<i>(A;i?) — (^{{\—R9)/\/R) — 
m-X- R9)/y/R). 



block A = Aq.2 
coordinate X — Xr 



dense 
l-F^2 (AVi?) 

\-{^e^{X-R)Y 



1 



sparse 



A0$(A;i?){Ao$(A;i?)} 



3 One way ANOVA: the 0&:+ test 

The relative power of both tests calls for a single test that would be of level a 
while being as powerful as the best between the block- and coordinate-tests. 
The following test approaches that goal by defining a point estimate based on 
joint block- and a coordinate-thresholding on the same parameters. 

We now explain the notation F*^, and i^c>&+_ dimension two, the 
^2-ball employed by Fisher block-test is a circle symbolized by 'O', the two 
canonical directions employed by coordinate thresholding are the horizontal and 
vertical directions symbolized by '-|-', and so we call the joint test the 0&-I- test. 

To define the 0&-I- test, consider the concatenated matrix [X, X] and two 
sets of coefhcients and B-i- Block 0^ into one block and treat the second 
coordinatewise. For reason we will explain, rescale the matrix X into X\ = XDi 
and X2 — XD2, where Di and D2 are diagonal. Finally consider the point 
estimate 0{X) = (0i(A), ^2,1(^^)1 ■ • ■ i ^2,t(A)) defined as a solution to 

(l- jj^f^y^j^D^^X^n with ri =yA-X202(A) 
f 1 — TT^ — r I 1732— x?,r2 t with 
r2.t =YA- XiOi - Y.,^, X2,.02,»(A) i = 1, . . . , T 

(13) 
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for a given positive threshold A and smoothness parameter s > 1, where ja = 
y — Pay as before. The solution to the system is unique if s > 1 Sardy 2012| , 
and has the property that 6{\) = iff A > max{||Xj''y^||2, ||Xjyyi||oo} for 
s > 1. This is a particular case of Lemma 1 proved below, which leads to the 
following test. 




Figure 1: Power analysis of the three thresholding tests at a prescribed level 
a = .05 for T = 5 treatments and R — IQ replications: block, coordinate and 
0&+ tests. The lines are the theoretical powers of Table [l] and the dots are 
empirical probabilities estimated by Monte-Carlo. Note that each test starts at 
power a = 0.05 for = 0, as expected. 



Theorem 3 (0&+ test): Consider model Q for which we test ([7| at a 
prescribed level a. Let Yq ~ N(/xl, (t^/at) be the distribution of Y under Hq 
and let ct be a positive estimate of a such that ct/ct is a pivot. Then 

Ao.(2,oo) = max{||xT(Yo - P^Yo)||2)/<7(Yo), \\X'^{Yo - PaYo)|U/'7(Yo)} 

(14) 

is a pivot with distribution i^®^ ^ . Defining the thresholded point esti- 
mate 0{y I a{y)]\a,(2,oo)) solution to (13) for the observations y and setting 
the threshold Aq, (2,00) such that F®^ ^{^a.{2,oo)) = 1 — a, then the test 



0(y) 

has level a. 



1 if 0(y/<T(y);A„,(2,oo)) y^O 
otherwise 
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Mean— based rescaling 



Quantile— based rescaling 




Figure 2: Illustration of mean-based rescaling (left) and quantile-based rescaling 
(right) for one-way ANOVA with T — 5 treatments and R — 10 replications. 
The series of 6 boxplots represents the empirical distribution of the components 
of the pivot Aq (2.oo) defined in (14 1 under the null: the first one corresponds to 
realizations of ||A']^Yo||2 and the other 5 correspond to realizations of |xJjYo| 
for t = 1, . . . ,T. We observe that the first boxplot on the right figure has the 
advantage of having upper extremes in the magnitude of the other five. 



Following on Section 2.3 and Table [T] we consider the power of the joint-test 
under both dense and sparse alternatives as a function of 9 (again assuming 
yCt and a are known). Figure [l] shows the power function of the block- and 
coordinate-tests (curve), and estimate them as well for values of on a grid 
with a Monte-Carlo simulation. We also report the empirical power of the 
0&+ test on the same grid, which has the remarkable property of performing 
under both alternatives almost as well as the best of the two individual tests. 

To achieve with a single test a power nearly as good as the best of the two 
individual tests, rescaling the design matrix X by Di and D2 is a crucial step. 
To allow the joint-test to have the same sensitivity whether the alternative is of 
the dense or sparse type, we perform the following quantile-based rescaling: we 
let the matrix corresponding to the block coefficients 61 be Xi — XDi where Di 
is diagonal with entries 1/Xa^2', likewise we let X2 = XD2 where D2 is diagonal 
with entries for the coefficients 62 thresholded coordinatewise. The 

theoretical values of Aq.2 and Aq,_oo are known ( 12 1 in our simple setting, but in 
more complex settings, we rely on a Monte-Carlo simulation to estimate them. 
Figure [2] illustrates the advantage of quantile-based rescaling (right), as opposed 



to mean-based rescaling (left) proposed by Yuan and Lin [2006| for group lasso. 



The left plot shows that, under the null, the boxplots are centered around their 
means (horizontal dotted line); because of that rescaling, the distribution of the 
block statistics ||A]^Yo||2 (first boxplot from left) has its largest observations 
significantly lower than those of the coordinate statistics |xJ]^Yo|, . . . , jxJjnYol 
(second to sixth boxplots) . Consequently, with an alternative hypothesis of the 
dense type, the joint-test would have low power with that rescaling. Instead, 
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the right plot of Figure |2] shows how quantile rescahng centers the distributions 
of the block and coordinate statistics around their (1 — Q;)-quantile (horizontal 
dotted line), hence providing the joint test with a homogeneous sensitivity under 
both dense and sparse alternatives. 

4 Tukey multiple comparisons test 



When the null hypothesis ([6| is rejected, Tukey 1953 is interested in identifying 
which null hypotheses 

i/^*'*'^ - /if - 0, t^l,...,T, t' = t + l,...,T (15) 
caused rejection of His test is dual to intervals 

TT,N-T{a)^{y) , , _ _ , TT,N-T{a)^{y) 

Vt — Vt' , < ut — Ut' < % — Vt' H , (Id) 

based on the Studentized range distribution Tt,n-t, where ^^{y) is the unbiased 
estimate of variance. Here i?t > is the number of replication in treatment t 
(not necessarily all equal to R). The test rejects iJg*'* ' if zero is not covered by 



its corresponding interval ( |16[ ) . The level a of the test satisfies that none of the 
T(T — l)/2 tests are rejected with probability 1 — a under the null hypothesis 
([S]). Wh ile the level is exact when R^ = R for all t, the test is conservative 
[Hayter 1984 for unbalanced designs, that is when the number of replication 



Rt differs between treatments. 

Tukey's multiple comparisons test can also be derived based on thresholding 
and its level can be set to a, even when the number of replication Rt differs 
between treatments. To see that, first note that the design matrix is orthogonal 
with X'^X = diag(i?i,...,i?T)- Then let E = {X'^X)-'^X^ and A be the 
T(T — l)/2 X T matrix such that 6 = Afi are the pairwise differences. Left 
multiplying ^ be AE implies y = X5+e with X = I,y — AEy and e = AEe. 



The coordinate thresholding estimate defined in (11) can now be applied 
to that latter model. First rescaling must be performed: the diagonal — 
dia,g{a^ A EE^ A'^) of the covariance matrix of e is not constant, unless Rt — R 
for alH = 1, . . . , T. So we standardize X = I such that the marginals of X'^y 
are Gaussian with identical variance under Hq, by multiplying X by D~^. Since 
the matrix X = D^^ is diagonal, the coordinate thresholding estimate defined 



in (11) has the closed form expression 



ht.t')W = ( 1 - p t;; 1 ) d(^t,t')yit.t'), (17) 

for all pairs {t,t'). This thresholded point estimate leads to the following test. 

Theorem 4 (Coordinate thresholding test and Tukey multiple compar- 
isons test). Consider model ([s]) for which we test all null hypotheses (15) 



at a prescribed level a. Let Yq ^ N(/tl, ct^/at) be the distribution of Y 
in ^ and let ct^ be the standard unbiased estimate of the variance. Then 
Ao,co — ll-D^"' Ai?Yo||oo/o'(Yo) is a pivot with distribution . Defining the 
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Power 



Oracle property 



coordinate thresholding 
Tukey range 




Figure 3: Results of Monte Carlo simulation for Tukey's multiple comparisons 
test for one-way ANOVA model ^ with T = 5 groups and sparse alternatives 
of the form Hi : = (6*, 0,0, 0,0) with 9 G [0,6] The number of replication 
is (1,5,9,10,10) in each of the five groups. Left plot: percentage increase in 
power between the exact thresholding test (coordinate thresholding) and the 
conservative Tukey test. Right plot: average number of correct detections as a 
function of 9, the maximum possible being T — 1 — A. 



thresholded point estimate ^(y/<T(y); Aq^oo) in (171 for the observations y and 
setting the threshold \a,oo such that F^^ ^{Xa,oo) = 1 — a, then the test 

^ f 1 if <^(t,t')(y/o-(y); Aa,oo) 7^ for at least one {t,t') 
\ otherwise 

has level a. Moreover \i Rt = R for all treatments, this test is equivalent to 
Tukey's range test with the relation Tt.n-t{oi) = Aq^oo\/2, where N = RT. 

Proof: the proof that this test has level a is straightforward. For the equiv- 
alence to Tukey multiple comparisons test when Rt — R for all treatments t, 



note that on the one hand Tukey's test rejects when at least one interval ( 16 ) 
does not cover zero, or equivalently when \yt — yt' \ — TT,N-Ti<^)S'{y) / VR < 
for at least one pair {t,t'). On the other hand, the thresholding test rejects 
when \y(t,t')/d(^t,t')/^{y)\ < Aa,oo, where y^t^t') = {^Ey)(t,t') = Vt - Vt', and all 
f,-) = 2/R when R — Rf. So the thresholding test rejects when \yt — yt'\ < 

Ao!,oo<5'(y)\/2/i?. So the two tests are equivalent and Tt,n-t{oc) = Xa,ao^- □ 
We illustrate the gain in power and oracle property of the thresholding test 
in comparison to Tukey's multiple comparisons test on the same Monte Carlo 
simulation as in Section [Sj except that the number of replication in each of the 
T — b treatments is (1, 5, 9, 10, 10) instead of i?t = i? = 10 for all treatments. 
The alternative hypothesis of the form Hi : 9 = [9, 0, 0, 0, 0) is indexed by the 
parameter 9 in the range [0,6]. In that setting, Tukey's test is conservative as 
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we observe on the left plot of Figure [3] where the percentage increase of power 
is plotted as a function of 9. The right plot informs on the number of correct 
detections, with a target value of T — 1 non-zero entries guessed correctly. We 
see on the graph that the thresholding test has slighlty more correct detections 
on average than Tukey's conservative test. 

If it is not clear to the statistician whether the alternative hypothesis is 
sparse or dense, then Tukey multiple comparisons test could be turned into an 
0&:-|- test as well to have more power under both alternatives. 

Finally we considered pairwise contrasts, but the thresholding test could 
easily be implemented to test other contrasts. 



5 Thresholding test for general ANOVA models 

We now consider general ANOVA models which can be written in the form 

y = Ah + Xe^(TZ with I = . . . , 0q), (18) 

[ Z^N(0,/Ar). 

The matrix A corresponds to the nuisance parameters b. In the one-way 
ANOVA considered in the previous section, A — 1 and h is the intercept co- 
efficients, but A can include a large number of parameters we do not want to 
test. The N x P matrix X corresponds to the parameters of interest, and is 
the horizontal concatenation of matrices Xi, . . . , Xq corresponding to effects 
6i, . . . ,6q, each of respective size Pq such that Pq = P- For instance in a 

one-way ANOVA plus random effects, 6i are the main and O2 random effects, 
as in the application of Section [6] Another example of concatenated matrices is 
the joint test of Section [S] with coefficients (^i, 6'2i, . ■ . , 6'2t) with corresponding 
matrices X = [X1X21 . . . X2t]- We assume that matrices A and Xi, . . . , Xq are 
all full rank; X needs not be full rank. Moreover we assume the space of the 
nuisance parameters are not too large, in the sense that all Xq must have some 
components outside the range of A. 

Our goal is to test the null hypothesis ([7|, that is test 

ffo : ^1 -0, ...,0Q -0 (19) 

at a desired level a. To derive a threshold-based test, we must define a point es- 



timate that thresholds and is uniquely defined. Sardy 2012 guarantees unique- 



ness of a thresholding estimator with a linear and invertible reparametrization 



of ( 18 ) into 



X = [Xi...Xq], 
y = Ah + X-f + a2 with \ 7 = (7^, . . . , 7^), (20) 

z^N(0,/jv), 

where each Xq must satisfy the orthogonality condition: XjA"g = d^Ip^ with 
dq > 0. Group lasso is also defined with this condition Yuan and Lin 2006 



but not necessarily uniquely. Orthogonalization can be achieved with a QR or 



SVD decomposition, for instance. Testing ( 19 1 is equivalent to testing 

Ho: 7i=0, ...,7q = 0. (21) 
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For a given threshold A > and a smoothness parameter s > 1, we introduce 
the thresholded point estimate 7 defined as a solution (not necessarily unique 
unless s > 1) to the following nonlinear system 



with Tg^YA- J2q'^q^q'lq'Wi <? ^ Qbiock 

%,tW = {l-j^r^)^Scl,rg,t/dq 

with r^.t =yA- X_(q.t)7_^ ^{X), qE Qcoord, t = ^,---,Pq 

(22) 

where is defined in ^ and Qbiock are the indexes of blocked variables (resp., 
Qcoord for variables thresholded coordinatewise) , and X„(^ is the matrix X 
without column t of block q |Sardy[ 2012| . This point estimate has the following 
important property. 



Lemma 1: Considering system (22 1 for given s > 1 and A > 0, then 



7 (A) = for aU g e Q 

block ^ =sicoord 

(23) 

if and only if 

A > max{ max H^Jy^lU, max HXjy^Hoo}. (24) 

<?e Qbiock ge Qcoord 

In this case, 7(A) is uniquely defined. 

Proof: the implication is straightforward. For the converse, the choice of 



A in ( 24 1 implies is a solution. The proof is complete if the zero solution 



is unique. When s > 1, Sardy 2012 proved uniqueness of the solution to 



(22 1. When s = 1, (22| are the first order optimality conditions to the hybrid 



lasso-grouped lasso defined as solution to 

min^l|yA-^7ll2 + -A||7llQ, 
■y i 

where |17|1q = Eggg^ock H'^Jb + EqGQ,„„,<j \\lq\\\ is a hybrid-norm. This cost 
function C(7) of the above minimization problem is convex in 7, and the solu- 
tion may not be unique in that case. In fact if two solutions 7]^ and 72 exist 
then their convex combinations 7^ = 8-^-^ + (1 ^ '5)72 are also solutions, that is 
C(7^) = C(72) = C(75) = C* for aU i e [0, 1]. But suppose X71 ^ X72, then 
the strict convexity of || • II2 and convexity of the Q-norm imply that 

Ch,) = ^ll%A-^7i) + (l-<5)(y^- 172)11^ + A||57i + (l-5)(7)2||e 

< \{i\\YA - X7i||^ + (1 - <5)||y^ - I72II2} + A(<5||7iIIq + (1 - 
= C*. 

This contradiction implies that any two solutions must satisfy X'^^ — ^72 j 
and have the same least squares cost. So their penalty term must be equal: 
''^II7iIIq = '^II72IIq- Since 7^ = is a solution, its Q-norm is zero. Then 
necessarily 72 = 0. □ 

The following theorem proposes a test of level a and shows it is a thresholding 
test since it amounts to testing whether a thresholded point estimate is null or 
not. Its proof is based on Lemma 1. 
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Theorem 4 (Thresholding test for ANOVA): Consider model (18) for which 



we test ( 19 1 at a prescribed level a. Assume A is full column rank and let Pa 
be the projection in the range of A. Let Yq ^ N(Ab,(T^/Ar) be the distribution 
of Y under _ffo and let o-^ be the standard unbiased estimate of variance. Then 

:yo) 

(25) 



Ao = max{ max ||Xj(Yo - PaYo)||2, max ||Xj(Yo - PaYo)||oo}/'7(Yo) 



is a pivot with distribution Pao- Letting = Ff^^{l — a), then the test 

{1 ifAa<max{ maxgeg^,,^^J|Xj(y - PAy)||2/(T(y), 
max^GQ.oo.d \\Xq {y - PAy)\\oolcr{y)} 
otherwise 

has level a. Moreover defining the thresholded point estimate 7(y/<T(y); Aq.) as 



a solution to ( 22 ) for the observation y, then the test 



^^^> 1 otherwise 



Rescaling the block matrices Xq after orthonormalizing them is a crucial 
step as we illustrated in Section |3] Quantile rescaling allows the Q blocks to 



contribute equally to the distribution of the pivot Aq in (25), regardless of 
their sizes. Quantile rescaling is defined as follows. Given a block q and the 
orthonormalization Wq of Xq (with QR or SVD), quantile rescaling applies the 
same factor to Wq and leads to the rescaled matrix: 

• Xq — Wqdq, whcrc l/dq = \^^\ IS the (1 — a)-quantile of the distribu- 
tion of ||W^yyi||2 under the null, if the corresponding coefhcients 9 are 
thresholded blockwise; 

• Xq = Wqdq, wherc l/dq = Xa^oo IS the (1 — a)-quantile of the distribu- 
tion of II VF^y^illoo under the null, if the corresponding coefficients 9 are 
thresholded coordinatewise. 

In the case P > N, we propose the following estimate of the standard devi- 
ation a. Let X UDV^ be the singular value decomposition of X, the design 

TO " LS ,1-1^ " LS 

matrix of rank R < N. Then 7^^ = D$ - N{'y,a^lR), where is the 
least squares estimate with the transformed matrix XV, the matrix of principal 
component regression. In eigen directions of small singular values dr, the true 
coefficients 7^ should essentially be zero. So we propose to estimate the standard 
deviation with a — MAD(|7p^^|, . . . , for po large enough, say po = L^/2J . 

If P is prohibitively large to prevent an SVD, then its columns can be sampled 
to create sample matrices of a reasonable size, and repeated estimations of a 
can then be aggregated into one. Note that this resampling procedure should 
be reproduced under the null to determine the appropriate threshold. 



6 Application 

We illustrate the thresholding test on a real data set modeled with mixed-effects 



Pinheiro and Bates 2000 . The effort j/y required (on the Borg scale) to arise 
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from a stool is measured for J = 9 different subjects each using 1 = 4 different 
types of stools. A linear mixed-effects model can be written as 

= ea + xei + ze2 + e,j, 

where X model the fixed effects for Types (4 columns) and Z models the random 
effect for Subjects (9 columns). The noise is assumed i.i.d. N(0,ct^) and 02 is 
believed to be independent realizations from N(0,CT2). The goal is to test 

Ho ■■ 6*1,1 = ^1,2 = ^1,3 = 6*1,4 = and as = 0, 

or equivalently 

Ho ■■ 6*1.1 = 6*1,2 = 6*1,3 = 6*1,4 = and 6»2 = 0. 

So we employ the thresholded test coordinatewise for the fixed effects and block- 
wise for the random effect. Table [2] reports the result at a level a = 0.05. The 
joint coordinate and block thresholding test rejects the null hypothesis because 
Oi and 62 are both declared significantly different from zero. Moreover the test 
provides the information that level 3 of the type of stool is not significant. 

After choosing the contrast J2i=i = Oj the Ime procedure available in R 
also declares 172 significantly different from zero, and ^1,3 not significant. 



Table 2: ErgoStool data: thresholded estimate at a level a = 0.05. The four 
fixed effects are thresholded coordinatewise and the random effects blockwise. 



Fixed 


Oo 




^1,1 


^1.2 


6*1,3 


01,4 










10.2 




-0.81 


1.38 





-0.14 








Random 


6*2,1 


02.2 


02,3 


62.4 


6*2,5 


02,6 


6*2,7 


02,S 


6*2,9 




0.52 


0.52 


0.11 


-0.30 


-0.50 


-0.03 


0.11 


-0.57 


-0.10 



7 Extension 



Instead of pure testing, Yuan and Lin 2006 are interested in model selection 



with good mean squared error performance. To that aim, the universal threshold 
of Donoho and Johnstone 1994| can be adapted to our thresholding procedure. 
Recall that for wavelet smoothing, the matrix X is orthonormal and the uni- 
versal threshold A = ^/2^ogN has the property to recover the tr ue zero-vector 



with high probability since P(0Ajv — ^ I -^o) ~ 1 — l/V"" logiV. Donoho et al. 
1995 also showed minimax results for this choice of threshold. In the situation 



where X is not orthonormal but is rather a complex ANOVA matrix, one can 
define the quantile universal threshold. 



Definition: Consider model ( 18 ) where 6 is segmented into Q groups. The 



quantile universal threshold (QUT) for the point estimate ( 22 ) is Xq such that 



■Pao(-^q) = 1 - " with a = l/^7rlog(3. 



(26) 



where F^g is the null distribution of the smallest threshold Aq defined in (25) 
that sets to zero all estimated coefficients when the true ones are null. 
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We investigate the model selection and predictive performance of employing 



the quantile universal threshold ( 26 ) to threshold the parameters of the linear 
model (18 1. We consider Models III and IV used by Yuan and Lin 2006 to 
compare various estimators. Letting e ^ N(0, 2^), then 

• Model III has 2 factors out oiQ — 16: let Zi, . . . , Ziq, W be i.i.d. standard 
Gaussian and Xi = {Zi + W) /V2, then generate 100 samples from 



r = {x 



X3} + {^Xl~Xl + ^Xe} + e, 



There is a total of 16 groups of size 3. 

• Model IV has 3 factors out of Q = 20: let Xi, . . . , X20 be generated as in 
Model III, and let Xn, . . . ,X2o be trichotomized as 0, 1 and 2 if smaller 
then $^^(1/3), larger than $^^(2/3) or in between, then generate 100 
samples from 



Y = {Xl+Xl+Xs}+{^Xl-Xi + -Xe}+{^I{Xu = 0)+/(Xu = l)}+e, 

There is a total of 20 groups, half of size 3 and half of size 2. 

We compare the performance of three estimators, least squares, grouped lasso 
and smooth blockwise iterative thresholding ( 22 ) for s = 1 (SBITE) and quantile 



rescaling based on three criteria: the number of factors selected, the MSE on 
the coefficients 0, and, of marginal interest for ANOVA, the predictive MSE 
on fi — X9. We estimate these quantities by taking the average over 200 runs 
of a Monte-Carlo simulation. The results are summarized in Table [3 The 



selected threshold A for group-lasso is either Cp or oracle Yuan and Lin 



The SBITE estimator uses the quantile universal threshold (261 instead. The 



2006 



empirical results point to the excellent performance of SBITE with the quantile 
universal threshold both for the estimation of the number of factors and the 
MSE of the estimated ANOVA coefficients. 



8 Conclusions 

Thresholding tests alleviate two problems of standard ANOVA tests: they do 
not require to specify types of contraint and they do not require exact knowl- 
edge of the distribution of complex pivots but simply require an estimate of the 
critical value, for instance by Monte Carlo. For the first time, block and coor- 
dinate thresholding is employed jointly to combine tests of various natures on 
the same parameters and therefore increase the power of the test under different 
alternatives. Hence, observing that Fisher's test comes from block threshold- 
ing and Tukey's test comes from coordinate thresholding, we filled a possible 
continuum between these two tests by developing hybrid tests based on £2- and 
^00-norms, essentially tests based on combined F- and i-tests. More generally 
£p-tests could be derived. 

How to put variables into groups is the choice of the statistician based not 
only on the nature of the parameters (fixed or random effects, main effects, 
interaction effects) but also on the type of alternative hypothesis he or she 
wants to test (sparse or dense). 
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Table 3: Results of Monte-Carlo simulation of Yuan and Lin 
p. 61]. In bold, the best between Cp and QUT. 



2006 Table 1, 





Least squares 


Group lasso 
(Cp/oracle) 


SBITE 
(QUT) 


Model III 








Estimated number of factors 








out of 16. True=2. 


16 


11/7.5 


3.7 


Model error 








on 6 


7.2 


1.5/0.7 


0.6 


on xe 


7.5 


1.5/0.9 


1.4 


Model IV 








Estimated number of factors 








out of 20. True==3. 


20 


15/10 


5.2 


Model error 








on e 


15 


3.4/2.1 


2.9 


on xe 


5.7 


1.6/1.1 


2.0 



By deriving new tests based on thresholding in a linear ANOVA setting, 
this paper is the extension and practical implementation of group lasso and the 
max-test. The level of the test can be set to any desired level, which was not 
addressed by the original group lasso. We also showed that the proposed quantile 
rescaling is crucial to insure that parameters are democratically represented in 
the test whether they belong to a block of large or small size. 

Combining dependent tests of various natures could be done for the higher 
criticism and false discovery rate approaches as well, by extending the work of 
Donoho and Jiashun 2004 and Benjamini and Hochberg 1995 with p-values 



related to dependent t-tests and F-tests. 
R code is available upon request. 
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